candidate hypothesis
Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation
Lyu, Boxuan, Kamigaito, Hidetaka, Funakoshi, Kotaro, Okumura, Manabu
Maximum a posteriori decoding, a commonly used method for neural machine translation (NMT), aims to maximize the estimated posterior probability. However, high estimated probability does not always lead to high translation quality. Minimum Bayes Risk (MBR) decoding offers an alternative by seeking hypotheses with the highest expected utility. In this work, we show that Quality Estimation (QE) reranking, which uses a QE model as a reranker, can be viewed as a variant of MBR. Inspired by this, we propose source-based MBR (sMBR) decoding, a novel approach that utilizes synthetic sources generated by backward translation as "support hypotheses" and a reference-free quality estimation metric as the utility function, marking the first work to solely use sources in MBR decoding. Experiments show that sMBR significantly outperforms QE reranking and is competitive with standard MBR decoding. Furthermore, sMBR calls the utility function fewer times than MBR. Our findings suggest that sMBR is a promising approach for high-quality NMT decoding.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
- Asia > Singapore (0.04)
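The expected-utility selection that MBR decoding performs can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `mbr_decode` and `token_overlap` are hypothetical, and the toy token-overlap utility is an assumption standing in for a real translation metric or QE model. Passing a different support set (for example, a single source, or several synthetic sources) with a reference-free utility would correspond to the QE-reranking and sMBR settings described in the abstract.

```python
from typing import Callable, Sequence


def token_overlap(a: str, b: str) -> float:
    """Toy utility (an assumption): Jaccard overlap of whitespace tokens."""
    ta, tb = set(a.split()), set(b.split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def mbr_decode(
    candidates: Sequence[str],
    supports: Sequence[str],
    utility: Callable[[str, str], float],
) -> str:
    """Return the candidate with the highest mean utility over the supports."""

    def expected_utility(candidate: str) -> float:
        return sum(utility(candidate, s) for s in supports) / len(supports)

    return max(candidates, key=expected_utility)


# Standard MBR uses the candidate list itself as the support set.
candidates = [
    "the cat sat",
    "the cat sat down",
    "the cat sat there",
    "a dog ran",
]
best = mbr_decode(candidates, candidates, token_overlap)
print(best)
```

In this sketch, standard MBR calls the utility once per (candidate, support) pair, so the number of calls grows with the product of the two set sizes; a smaller support set reduces that count.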
Black-box Generalization of Machine Teaching
Cao, Xiaofeng, Guo, Yaming, Tsang, Ivor W., Kwok, James T.
Hypothesis pruning maximizes hypothesis updates in active learning to find the desired unlabeled data. An inherent assumption is that this manner of learning can drive those updates toward the optimal hypothesis. However, convergence may not be guaranteed if the incremental updates are negative and disordered. In this paper, we introduce a black-box teaching hypothesis $h^\mathcal{T}$ employing a tighter slack term $\left(1+\mathcal{F}^{\mathcal{T}}(\widehat{h}_t)\right)\Delta_t$ to replace the typical $2\Delta_t$ for pruning. Theoretically, we prove that, under the guidance of this teaching hypothesis, the learner can converge to tighter generalization error and label complexity bounds than non-educated learners who do not receive any guidance from a teacher: 1) the generalization error upper bound can be reduced from $R(h^*)+4\Delta_{T-1}$ to approximately $R(h^{\mathcal{T}})+2\Delta_{T-1}$, and 2) the label complexity upper bound can be decreased from $4 \theta\left(TR(h^{*})+2O(\sqrt{T})\right)$ to approximately $2\theta\left(2TR(h^{\mathcal{T}})+3 O(\sqrt{T})\right)$. To be strict with our assumption, self-improvement of teaching is first proposed for the case where $h^\mathcal{T}$ loosely approximates $h^*$. On the learner side, we further consider two teaching scenarios: teaching a white-box learner and teaching a black-box learner. Experiments verify this idea and show better generalization performance than fundamental active learning strategies such as IWAL and IWAL-D.
- North America > United States > New York (0.04)
- Asia > Singapore (0.04)
- Asia > Middle East > Jordan (0.04)
- Asia > China > Hong Kong (0.04)
- Education (1.00)
- Transportation > Air (0.84)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)
- Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Hypothesis Search: Inductive Reasoning with Language Models
Wang, Ruocheng, Zelikman, Eric, Poesia, Gabriel, Pu, Yewen, Haber, Nick, Goodman, Noah D.
Inductive reasoning is a core problem-solving capacity: humans can identify underlying principles from a few examples, which can then be robustly generalized to novel scenarios. Recent work has evaluated large language models (LLMs) on inductive reasoning tasks by directly prompting them, yielding "in-context learning." This can work well for straightforward inductive tasks, but performs very poorly on more complex tasks such as the Abstraction and Reasoning Corpus (ARC). In this work, we propose to improve the inductive reasoning ability of LLMs by generating explicit hypotheses at multiple levels of abstraction: we prompt the LLM to propose multiple abstract hypotheses about the problem, in natural language, then implement the natural language hypotheses as concrete Python programs. These programs can be directly verified by running on the observed examples and generalized to novel inputs. Because of the prohibitive cost of generation with state-of-the-art LLMs, we consider a middle step to filter the set of hypotheses that will be implemented into programs: we either ask the LLM to summarize them into a smaller set of hypotheses, or ask human annotators to select a subset of the hypotheses. We verify our pipeline's effectiveness on the ARC visual inductive reasoning benchmark, its variant 1D-ARC, and the string transformation dataset SyGuS. On a random 40-problem subset of ARC, our automated pipeline using LLM summaries achieves 27.5% accuracy, significantly outperforming the direct prompting baseline (accuracy of 12.5%). With the minimal human input of selecting from LLM-generated candidates, the performance is boosted to 37.5%. (And we argue this is a lower bound on the performance of our approach without filtering.) Our ablation studies show that abstract hypothesis generation and concrete program representations are both beneficial for LLMs to perform inductive reasoning tasks.
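The verification step described above, in which candidate programs are run on the observed examples and only the consistent ones are kept, might look like the following sketch. The function name `select_consistent` and the three hand-written candidate programs are hypothetical stand-ins for LLM-generated hypotheses; the LLM prompting and filtering stages are not modeled here.

```python
from typing import Any, Callable, Dict, List, Tuple

_FAILED = object()  # sentinel returned when a candidate program raises


def _safe_run(program: Callable[[Any], Any], x: Any) -> Any:
    """Run a candidate program, treating any exception as a wrong answer."""
    try:
        return program(x)
    except Exception:
        return _FAILED


def select_consistent(
    programs: Dict[str, Callable[[Any], Any]],
    examples: List[Tuple[Any, Any]],
) -> List[str]:
    """Return names of programs that reproduce every observed example."""
    return [
        name
        for name, program in programs.items()
        if all(_safe_run(program, x) == y for x, y in examples)
    ]


# Hypothetical candidate hypotheses, each implemented as a Python program.
programs = {
    "reverse the string": lambda s: s[::-1],
    "uppercase the string": lambda s: s.upper(),
    "sort the characters": lambda s: "".join(sorted(s)),
}
examples = [("abc", "cba"), ("hello", "olleh")]

consistent = select_consistent(programs, examples)
print(consistent)
```

A program that passes this check can then be applied to a novel input, which is the generalization step the abstract refers to.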
Adaptive Sample Selection for Hypothesis Falsification
Furlong, P. Michael (SGT Inc. and NASA Ames)
Current approaches to autonomous exploration focus on collecting observations in the absence of prior knowledge of the phenomena under investigation. However, it is unlikely that robots will arrive at planetary bodies without scientists having formed one or more hypotheses explaining data collected by precursor operations such as satellite images. These exploring robots collect observations to falsify the proposed hypotheses, and incorporating those hypotheses can increase the efficiency of observation collection. This paper presents a novel algorithm, formulated in an exploration/exploitation framework, that directs robots to collect samples to determine which of a collection of hypotheses best explains data observed in situ by robots. We simulate a geologic exploration mission with a lander vehicle that can hop between locations of interest. This application is analogous to exploring, e.g., the Aitken Basin at the south pole of Earth's Moon, where sampling sites need to be separated by hundreds or thousands of meters. We demonstrate that sampling algorithms aware of the hypotheses under investigation perform statistically significantly better than standard approaches, making more effective use of mission resources.
ILP-Based Reasoning for Weighted Abduction
Inoue, Naoya (Tohoku University) | Inui, Kentaro (Tohoku University)
Abduction is widely used in the task of plan recognition, since it can be viewed as the task of finding the best explanation for a set of observations. The major drawback of abduction is its computational complexity. The task of abductive reasoning quickly becomes intractable as the background knowledge is increased. Recent efforts in the field of computational linguistics have enriched computational resources for commonsense reasoning. The enriched knowledge base facilitates exploring practical plan recognition models in an open domain. Therefore, it is essential to develop an efficient framework for such large-scale processing. In this paper, we propose an efficient implementation of Weighted abduction. Our framework transforms the problem of explanation finding in Weighted abduction into a linear programming problem. Our experiments showed that our approach efficiently solved problems of plan recognition and outperformed the state-of-the-art tool for Weighted abduction.
- Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling > Plan Recognition (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Abductive Reasoning (1.00)